
248 research outputs found

Are we overestimating the number of cell-cycling genes? The impact of background models for time series data.

Author: Futschik Matthias
Herzel H.
Publication venue: Gesellschaft für Informatik e. V.
Publication date: 08/02/2021

Periodic processes play fundamental roles in organisms. Prominent examples are the cell cycle and the circadian clock. Microarray technology has enabled us to screen complete sets of transcripts for possible association with such fundamental periodic processes on a system-wide level. Frequently, quite a large number of genes have been detected as periodically expressed. However, the small overlap of identified genes between different studies has cast considerable doubt on the reliability of the detected periodic expression. In this study, we show that a major reason for the lack of agreement is the use of an inadequate background model for the determination of significance. We demonstrate that the choice of background model has considerable impact on the statistical significance of periodic expression. For illustration, we reanalyzed two microarray studies of the yeast cell cycle. Our evaluation strongly indicates that the results of previous analyses might have been overoptimistic and that the use of a more suitable background model promises to give more realistic results.

Sapientia

Entropy and Long range correlations in literary English

Author: Ebeling W
Herzel H
Hilberg W
Li W
Nicolis J S
Schmitt A
Shannon C E
Stanley H E
T Pöschel
W Ebeling
Publication venue: 'IOP Publishing'
Publication date: 15/09/1993

Recently, long-range correlations were detected in nucleotide sequences and in human writings by several authors. We undertake here a systematic investigation of two books, Moby Dick by H. Melville and Grimm's tales, with respect to the existence of long-range correlations. The analysis is based on the calculation of entropy-like quantities such as the mutual information for pairs of letters and the entropy, the mean uncertainty, per letter. We further estimate the number of different subwords of a given length n. Filtering out the contributions due to the effects of the finite length of the texts, we find correlations ranging to a few hundred letters. Scaling laws for the mutual information (decay with a power law), for the entropy per letter (decay with the inverse square root of n) and for the word numbers (stretched exponential growth with n and with a power law of the text length) were found.
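The mutual information for pairs of letters mentioned in this abstract can be illustrated with a minimal sketch. The function name `mutual_information` and its parameters are hypothetical, not taken from the paper, and this raw plug-in estimate does not apply the finite-length corrections the abstract refers to.

```python
import math
from collections import Counter


def mutual_information(text, distance):
    """Plug-in mutual information (in nats) between letters `distance` apart.

    Illustrative only: no correction for finite text length is applied.
    """
    pairs = [(text[i], text[i + distance]) for i in range(len(text) - distance)]
    total = len(pairs)
    if total == 0:
        return 0.0
    joint = Counter(pairs)                  # counts of (letter, later letter)
    first = Counter(a for a, _ in pairs)    # marginal counts, first position
    second = Counter(b for _, b in pairs)   # marginal counts, second position
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / total
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * math.log(c * total / (first[a] * second[b]))
    return mi


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 20
    for k in (1, 2, 5, 10):
        print(k, round(mutual_information(sample, k), 4))
```

On a real text, one would evaluate this for increasing distances and compare against a shuffled copy of the text to gauge the finite-sample background.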

arXiv.org e-Print Archive

CiteSeerX

Crossref

Statistical analysis of the DNA sequence of human chromosome 22

Author: Grosse I.
Herzel H.
Holste D.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/10/2001

We study statistical patterns in the DNA sequence of human chromosome 22, the first completely sequenced human chromosome. We find that (i) the 33.4 × 10^6 nucleotide long human chromosome exhibits long-range power-law correlations over more than four orders of magnitude, (ii) the entropies H_n of the frequency distribution of oligonucleotides of length n (n-mers) grow sublinearly with increasing n, indicating the presence of higher-order correlations for all of the studied lengths 1 ≤ n ≤ 10, and (iii) the generalized entropies H_n(q) of n-mers decrease monotonically with increasing q and the decay of H_n(q) with q becomes steeper with increasing n ≤ 10, indicating that the frequency distribution of oligonucleotides becomes increasingly nonuniform as the length n increases. We investigate to what degree known biological features may explain the observed statistical patterns. We find that (iv) the presence of interspersed repeats may cause the sublinear increase of H_n with n, and that (v) the presence of monomeric tandem repeats as well as the suppression of CG dinucleotides may cause the observed decay of H_n(q) with q.

Cold Spring Harbor Laboratory Institutional Repository

Spreading and shortest paths in systems with sparse long-range connections

Author: Cristian F. Moukarzel
D. J. Watts
H. Herzel
M. Barthélémy
S. A. Pandit
Publication venue: 'American Physical Society (APS)'
Publication date: 21/05/1999

Spreading according to simple rules (e.g. of fire or diseases), and shortest-path distances are studied on d-dimensional systems with a small density p per site of long-range connections ("Small-World" lattices). The volume V(t) covered by the spreading quantity on an infinite system is exactly calculated in all dimensions. We find that V(t) grows initially as t^d/d for t>t^*, generalizing a previous result in one dimension. Using the properties of V(t), the average shortest-path distance \ell(r) can be calculated as a function of Euclidean distance r. It is found that \ell(r) = r for r<r_c=(2p \Gamma_d (d-1)!)^{-1/d} log(2p \Gamma_d L^d), and \ell(r) = r_c for r>r_c. The characteristic length r_c, which governs the behavior of shortest-path lengths, diverges with system size for all p>0. Therefore the mean separation s \sim p^{-1/d} between shortcut-ends is not a relevant internal length-scale for shortest-path lengths. We notice however that the globally averaged shortest-path length, divided by L, is a function of L/s only.

arXiv.org e-Print Archive

Crossref

Bias Analysis in Entropy Estimation

Author: Abramowitz M
Grassberger P
Harris B
Herzel H
Miller G
Shannon C E
Thomas Schürmann
Publication venue: 'IOP Publishing'
Publication date: 01/01/2004

We consider the problem of finite sample corrections for entropy estimation. New estimates of the Shannon entropy are proposed and their systematic error (the bias) is computed analytically. We find that our results cover correction formulas of current entropy estimates recently discussed in the literature. The trade-off between bias reduction and the increase of the corresponding statistical error is analyzed.
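To make the notion of finite-sample bias concrete, the following sketch compares the naive plug-in entropy estimate with the classical first-order correction (K - 1) / (2N), often called the Miller-Madow correction. This is a swapped-in textbook correction used for illustration, not the new estimators proposed in the paper; all function names and default parameters are assumptions.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Naive (maximum-likelihood) Shannon entropy estimate in nats."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in estimate plus the first-order bias term (K - 1) / (2 N).

    K is the number of distinct symbols actually observed in the sample.
    """
    n = len(samples)
    k = len(set(samples))
    return plugin_entropy(samples) + (k - 1) / (2 * n)


def mean_estimates(alphabet_size=8, sample_size=40, trials=2000, seed=0):
    """Average both estimators over many small samples of a uniform source.

    Returns (mean naive estimate, mean corrected estimate, true entropy).
    """
    rng = random.Random(seed)
    naive = corrected = 0.0
    for _ in range(trials):
        sample = [rng.randrange(alphabet_size) for _ in range(sample_size)]
        naive += plugin_entropy(sample)
        corrected += miller_madow_entropy(sample)
    return naive / trials, corrected / trials, math.log(alphabet_size)


if __name__ == "__main__":
    naive, corrected, true_value = mean_estimates()
    print(f"true {true_value:.4f}  naive {naive:.4f}  corrected {corrected:.4f}")
```

The naive estimate systematically underestimates the entropy of small samples; the correction reduces that bias, at the cost discussed in the abstract of a possible increase in statistical error.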

arXiv.org e-Print Archive

CiteSeerX

Crossref

Statistics of finite-time Lyapunov exponents in the Ulam map

Author: A. Prasad
A.M. Batista
C. Anteneodo
C. Ziehmann
F.M. Cucchietti
G. Boffetta
H. Fujisaka
H.-P. Herzel
J. Theiler
J.-P. Eckmann
J.C. Vallejo
K. Nam
S. Dawson
S. Grossmann
T. Nagashima
V.I. Oseledec
Publication venue: 'American Physical Society (APS)'
Publication date: 31/10/2003

The statistical properties of finite-time Lyapunov exponents at the Ulam point of the logistic map are investigated. The exact analytical expression for the autocorrelation function of one-step Lyapunov exponents is obtained, allowing the calculation of the variance of exponents computed over time intervals of length n. The variance anomalously decays as 1/n^2. The probability density of finite-time exponents noticeably deviates from the Gaussian shape, decaying with exponential tails and presenting 2^{n-1} spikes that narrow and accumulate close to the mean value with increasing n. The asymptotic expression for this probability distribution function is derived. It provides an adequate smooth approximation to describe numerical histograms built for not too small n, where the finiteness of bin size trims the sharp peaks.
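The 1/n^2 decay of the variance can be checked numerically with a rough Monte Carlo sketch. It assumes the Ulam point means the logistic map x -> 4x(1 - x) and draws initial conditions from its invariant density via x = sin^2(pi u / 2) with u uniform. The function names, sample sizes, and seed are illustrative choices, not taken from the paper.

```python
import math
import random
import statistics


def finite_time_exponent(x, n):
    """Average of the one-step exponents ln|f'(x)| over n iterations
    of the logistic map f(x) = 4 x (1 - x)."""
    total = 0.0
    for _ in range(n):
        total += math.log(abs(4.0 * (1.0 - 2.0 * x)))
        x = 4.0 * x * (1.0 - x)
    return total / n


def exponent_statistics(n, samples=20000, seed=0):
    """Mean and variance of finite-time exponents over random initial points.

    Initial points follow the invariant density 1 / (pi * sqrt(x (1 - x))).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        x0 = math.sin(0.5 * math.pi * rng.random()) ** 2
        values.append(finite_time_exponent(x0, n))
    return statistics.fmean(values), statistics.pvariance(values)


if __name__ == "__main__":
    for n in (4, 8, 16):
        mean, var = exponent_statistics(n)
        # If the variance decays as 1/n^2, n^2 * var stays roughly constant.
        print(n, round(mean, 4), round(n * n * var, 4))
```

If the variance scales as 1/n^2, the last printed column should level off as n grows, while the mean stays near ln 2.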

arXiv.org e-Print Archive

Crossref

Interpretation of biomechanical simulations of normal and chaotic vocal fold oscillations with empirical eigenfunctions

Author: Berry D.
Herzel H.
Krischer K.
Titze I.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/1994

Empirical orthogonal eigenfunctions are extracted from biomechanical simulations of normal and chaotic vocal fold oscillations. For normal phonation, two dominant empirical eigenfunctions capture the vibration patterns of the folds and exhibit a 1:1 entrainment. The eigenfunctions show some correspondence to theoretical low‐order normal modes of a simplified, three‐dimensional elastic continuum, and to the normal modes of a linearized two‐mass model. The eigenfunctions also facilitate a physical interpretation of energy transfer mechanisms in vocal fold dynamics. Subharmonic regimes and chaotic oscillations are observed during simulations of a lax cover, in which case at least three empirical eigenfunctions are necessary to capture the resulting vocal fold oscillations. These chaotic oscillations might be understood in terms of a desynchronization of a few of the low‐order modes, and may be related to mechanisms of creaky voice or vocal fry. Furthermore, some of the empirical eigenfunctions captured during complex oscillations correspond to higher‐order normal modes described in earlier theoretical work. The empirical eigenfunctions may also be useful in the design of lower‐order models (valid over the range for which the empirical eigenfunctions remain more or less constant), and may help facilitate bifurcation analyses of the biomechanical simulation

MPG.PuRe

Guessing probability distributions from small samples

Author: A. Apostolico
A. Schmitt
B. McMillan
Donald E. Knuth
H. Herzel
Helge Rosé
Thorsten Pöschel
W. Ebeling
W. H. Press
Werner Ebeling
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1995

We propose a new method for the calculation of the statistical properties, such as the entropy, of unknown generators of symbolic sequences. The probability distribution p(k) of the elements k of a population can be approximated by the frequencies f(k) of a sample provided the sample is long enough so that each element k occurs many times. Our method yields an approximation if this precondition does not hold. For a given f(k) we recalculate the Zipf-ordered probability distribution by optimization of the parameters of a guessed distribution. We demonstrate that our method yields reliable results.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Finite-sample frequency distributions originating from an equiprobability distribution

Author: A. O. Schmitt
C.-K. Peng
D. Holste
H. Herzel
Jan A. Freund
P. Allegrini
P. Bernaola-Galván
T. Pöschel
Thorsten Pöschel
Publication venue: 'American Physical Society (APS)'
Publication date: 19/03/2002

Given an equidistribution of probabilities p(i) = 1/N, i = 1..N, what is the expected corresponding rank-ordered frequency distribution f(i), i = 1..N, if an ensemble of M events is drawn?
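The question posed in this abstract can be explored by direct simulation. The sketch below draws M events from N equally likely outcomes, sorts the observed frequencies by rank, and averages over many ensembles. It is a brute-force Monte Carlo illustration under assumed parameter values, not the analytical treatment of the paper, and all names are hypothetical.

```python
import random
from collections import Counter


def rank_ordered_frequencies(n_states, n_events, rng):
    """Draw n_events from n_states equiprobable outcomes and return the
    relative frequencies sorted from most to least frequent."""
    counts = Counter(rng.randrange(n_states) for _ in range(n_events))
    freqs = [counts.get(i, 0) / n_events for i in range(n_states)]
    return sorted(freqs, reverse=True)


def mean_rank_ordered(n_states=10, n_events=50, trials=2000, seed=1):
    """Average the rank-ordered frequency distribution over many ensembles."""
    rng = random.Random(seed)
    acc = [0.0] * n_states
    for _ in range(trials):
        ranked = rank_ordered_frequencies(n_states, n_events, rng)
        for rank, f in enumerate(ranked):
            acc[rank] += f
    return [a / trials for a in acc]


if __name__ == "__main__":
    for rank, f in enumerate(mean_rank_ordered(), start=1):
        print(rank, round(f, 4))
```

Even though every outcome has probability 1/N, the averaged rank-ordered frequencies decrease with rank, which is the finite-sample effect the abstract asks about.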

arXiv.org e-Print Archive

Crossref

Flexible web-based integration of distributed large-scale human protein interaction maps

Author: Chaurasia G.
Futschik M.E.
Haenig C.
Herzel H.
Iqbal Y.
Wanker E.E.
Publication venue: IMBio e.V.
Publication date: 26/02/2007

Protein-protein interactions constitute the backbone of many molecular processes. This has motivated the recent construction of several large-scale human protein-protein interaction maps [1-10]. Although these maps clearly offer a wealth of information, their use is challenging: complexity, rapid growth, and fragmentation of interaction data hamper their usability. To overcome these hurdles, we have developed a publicly accessible database termed UniHI (Unified Human Interactome) for integration of human protein-protein interaction data. This database is designed to provide biomedical researchers with a common platform for exploring previously disconnected human interaction maps. UniHI offers researchers flexible integrated tools for accessing comprehensive information about the human interactome. Several features included in UniHI allow users to perform various types of network-oriented and functional analysis. At present, UniHI contains over 160,000 distinct interactions between 17,000 unique proteins from ten major interaction maps derived by both computational and experimental approaches [1-10]. Here we describe the details of the implementation and maintenance of UniHI and discuss the challenges that have to be addressed for a successful integration of interaction data.

MDC Repository